Cell Genomics
○ Elsevier BV
Preprints posted in the last 30 days, ranked by how well they match Cell Genomics's content profile, based on 162 papers previously published here. The average preprint has a 0.22% match score for this journal, so anything above that is already an above-average fit.
Wang, Y.; Truong, B.; Lu, W.; Fadil, C.; He, Y.; Luo, W.; Koyama, S.; Tsuo, K.; Paruchuri, K.; Yu, Z.; Hull, L. E.; Zheng, Z.; Carey, C. E.; Walters, R. K.; Neale, B. M.; Robinson, E. B.; Kraft, P.; Natarajan, P.; Martin, A. R.
Polygenic scores (PGS) are typically derived from single-trait genome-wide association studies (GWAS), yet many complex diseases arise from shared genetic liability distributed across correlated clinical dimensions. Accordingly, disease risk depends not only on how genetic liability is represented but also on the social context in which that liability is expressed. Whether phenome-derived latent factors improve prediction, and how social determinants of health (SDoH) modify the realized utility of PGS, remains unclear. Here we constructed PGS for 35 orthogonal latent phenomic factors derived from 2,772 phenotypes in 361,114 UK Biobank (UKB) participants and evaluated their phenomic specificity, cross-dataset portability and predictive performance relative to conventional disease-specific PGS across the UKB holdout, Mass General Brigham Biobank and the All of Us (AoU) Research Program. Factor-based PGS showed widespread, biologically coherent phenome-wide associations that were reproducible across biobanks and ancestries. Their predictive utility, however, was strongly disease dependent. For asthma, a respiratory factor PGS outperformed an internally derived disease-specific PGS and showed superior cross-ancestry portability, retaining 41.5% of European-ancestry predictive accuracy in African-ancestry individuals, compared with 22.9% for an asthma PGS derived from the largest available multi-ancestry GWAS. By contrast, disease-specific PGS remained superior for coronary artery disease (CAD) and type 2 diabetes (T2D). These findings suggest that phenome-derived aggregation is most beneficial when disease-specific GWAS incompletely capture underlying liability, including settings of biological heterogeneity or imprecise phenotyping. We then evaluated SDoH in AoU as a complementary axis shaping prevalent disease prediction beyond genetic susceptibility. 
Across all three diseases, SDoH contributed substantial and largely independent predictive information beyond the disease-optimal genetic model. SDoH also modified how genetic liability translated into observed disease prevalence: for asthma and CAD, genetic stratification attenuated with increasing social burden, whereas this attenuation was substantially weaker for T2D. As a result, the same genetic percentile corresponded to different standardized predicted prevalences across social strata, reflecting disease-specific shifts in baseline prevalence, genetic gradients and calibration. Together, these findings indicate that disease risk is shaped by both genetic liability and the social context in which that liability is realized. Phenome-derived PGS improve prediction under specific architectural conditions, whereas social context independently modifies the performance, calibration and interpretation of genetic risk across populations.
Pato, C. N.; Pato, M. T.; Mulle, J.; Hart, R. P.; Pang, Z.; Knowles, J. A.; Singh, T.; Maddhesiya, P.; Carvalho, C.; Merikangas, A.; Medeiros, H.; Bigdeli, T. B.; Kazemi, H.; Drake, J.; Vladimrov, V.; Maher, B.; Bacanu, S.-A.; Neale, B.; Fanous, A.
In an analysis of 173 multiplex families from the Portuguese Island Collection (PIC) we characterize the shared genetic architecture of serious mental illnesses (SMI) including schizophrenia (SZ), bipolar disorder (BP), major depression (MDD), and autism (ASD). Within this cohort, co-segregation of psychotic and mood disorders occurred in 28% of families, while 7% demonstrated co-segregation of intellectual disability or ASD with SZ and mood disorder phenotypes. Whole-genome sequencing (WGS) was performed on a three-generation PIC family to identify rare, large-effect variants. We identified an extremely rare predicted loss of function (LoF) mutation in the Chromodomain Helicase DNA Binding Protein 2 (CHD2) gene. These results demonstrate that high-density multiplex families in founder populations are a powerful resource for mapping rare, large-effect variants that cross clinical diagnostic boundaries, as the identified CHD2 mutation suggests that the disruption of a single neurodevelopmental gene may lead to diverse SMI phenotypes. By combining population and family-based methodologies, this approach leverages shared genetic backgrounds and environments to provide a unique opportunity for cellular studies to explore the biological mechanisms underlying SMI, offering significant potential to inform future functional research and identify novel therapeutic targets.
Sakaue, S.; Yang, D.; Zhang, H.; Posner, D.; Rodriguez, Z.; Love, Z.; Cui, J.; Budu-Aggrey, A.; Ho, Y.-L.; Costa, L.; Monach, P.; Huang, S.; Ishigaki, K.; Melley, C.; Tanukonda, V.; Sangar, R.; Maripuri, M.; Sweet, S. M.; Panickan, V.; McDermott, G.; Hanberg, J. S.; Riley, T.; Laufer, V.; Okada, Y.; Scott, I.; Bridges, S. L.; Baker, J.; VA Million Veteran Program; Wilson, P. W.; Gaziano, J. M.; Hong, C.; Verma, A.; Cho, K.; Huffman, J. E.; Cai, T.; Raychaudhuri, S.; Liao, K. P.
Rheumatoid arthritis (RA) is a heritable and common autoimmune condition. To date, most genetic associations were derived from individuals with either European or East Asian ancestries. Here, we applied a multimodal automated phenotyping strategy to define RA and performed a genome-wide association study (GWAS) of RA in the Million Veteran Program (MVP), including underrepresented African American (AFR) and Admixed American (AMR) populations. Meta-analyses with previous RA cohorts identified 152 autosomal genome-wide significant loci, of which 31 were novel. Inclusion of multi-ancestry data dramatically improved fine-mapping resolution. Functional characterization of these loci using single-cell transcriptomic and chromatin data suggested new RA genes such as CHD7 and CD247. We identified underappreciated functional roles of fine-grained immune cell states other than T cells, such as B cell and myeloid cell states. We observed that multi-ancestry polygenic risk scores using our data demonstrated better predictive ability, especially for AFR and AMR populations.
Zhang, N.; Wang, S.; Fu, J.; Ji, Y.; Liu, N.; Qian, Q.; Xue, H.; Ding, H.; Liang, M.; Qin, W.; Xu, J.; Yu, C.
Sex differences are commonly observed in neuroimaging phenotypes and in the risk of brain diseases, yet the underlying genetic mechanisms remain poorly understood. We investigated sex differences in the genetic architecture of 805 neuroimaging phenotypes in 22,950 males and 22,950 females matched for sample size and covariates, and systematically compared sex-stratified with sex-combined genetic analyses. We found eight variant-trait associations with significant sex differences, 235 fine-mapped sex-dominant causal associations, 457 sex-dominant colocalizations with sex hormones, and 96 sex-dominant colocalizations with schizophrenia. Compared with sex-combined analysis, sex-stratified analysis identified 47 new genetic associations, 170 new fine-mapped causal associations, 1,019 new colocalizations with sex hormones, and 191 new colocalizations with schizophrenia. Additionally, sex-stratified analysis improved global heritability and genetic-correlation estimates and enhanced polygenic prediction for certain phenotypes. This work highlights the need to routinely perform sex-stratified genetic association analyses to elucidate sex-specific and sex-shared genetic control of neuroimaging phenotypes and related disorders.
Ding, J.; Kang, H.; Spangenberg, A. L.; Liu, Y.; Martinez, F. D.; Carr, T. F.; Cusanovich, D.
RNA sequencing (RNA-seq) and the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) have become standard techniques for studying gene regulation in human populations. Single-cell (sc) "multiomic" genomic methodologies now enable researchers to dissect cellular heterogeneity while simultaneously measuring gene expression and chromatin accessibility within individual cells. However, single-cell approaches remain experimentally complex and cost-prohibitive, limiting their application in population studies, and motivating the development of new strategies for population-scale single-cell investigations. To this end, we have adapted and optimized a previous multiomic protocol, "Transcriptome, Epitope, and ATAC sequencing" (TEA-seq) through experimentation and simulation to incorporate sample multiplexing, thus resulting in our "multiplexed TEA-seq" (mTEA-seq) protocol. Using mTEA-seq, we sought to determine whether asthma that develops in conjunction with early-life elevated insulin levels might have an identifiable molecular signature. We studied samples from adult individuals (54 subjects, 272,003 cells) from the Tucson Children's Respiratory Study (TCRS), a birth cohort phenotypically characterized over four decades, to identify unique molecular characteristics of blood cells from asthmatics who had high serum insulin levels at age 6. Using a Bayesian approach, we found striking sex-specific effects. Male asthmatic subjects with high insulin at age 6 displayed widespread immune transcriptional and epigenetic alterations into adulthood compared to male non-asthmatic subjects without elevated insulin at age 6. We also found that male non-asthmatics with early-life high insulin showed epigenetic perturbations in adulthood, but not transcriptional changes. The consistency of epigenetic signals between these two groups that had high insulin at age 6 was highly cell-type-specific. 
For example, CD14+ monocytes displayed broadly common insulin-associated chromatin remodeling regardless of asthma status, while NK cells exhibited unique patterns of insulin-associated epigenetic reprogramming depending on asthma status. Finally, genotyping performed directly from our single-cell data enabled cell type-specific cis-QTL mapping that suggested HLA-DQB1 and AHI as genes for future study in insulin-associated asthma. Our investigation of childhood insulin-associated asthma demonstrates metabolically driven alterations in immune cells that persist into adulthood, thus providing a molecular signature of this asthma subtype, and offering novel insights for disease prevention and therapeutic intervention.
Orozco-Arias, S.; Ferrer-Pomer, I.; Rodrigues de Goes, F.; Gaviria-Orrego, S.; Gomiz-Fernandez, J.; Llatser-Torres, J.; Paschoal, A. R.; Guyot, R.; Gabaldon, T.
Transposable elements (TEs) are major drivers of genome evolution, yet their annotation and classification remain inconsistent and hard to reproduce across species. Fragmented repeats, lineage-specific innovations, and heterogeneous taxonomies across databases and tools complicate comparisons and slow progress in TE biology. To address this, we developed PanTEon, a cross-kingdom deep learning framework for reproducible TE classification that combines a harmonized database with an open, modular benchmarking platform. The PanTEon Database is an automatically curated, taxonomically broad TE repository spanning animals, plants, and fungi. The PanTEon platform standardizes training, evaluation, and inference across nine Machine Learning methods, while remaining extensible to user-defined architectures. Using this framework, we benchmark state-of-the-art Machine Learning-based TE classifiers across TE superfamilies and major eukaryotic lineages and find that performance varies markedly by kingdom and superfamily. Ensemble approaches and phylum-specific models improve predictive F1 scores, but cross-species generalization remains a major challenge. Together, PanTEon Database and PanTEon platform provide a reproducible, scalable, and extensible foundation for TE classification, enabling standardized evaluation of future AI methods and supporting community-driven annotation efforts.
Carver, S.; Perea-Chamblee, T.; Taraszka, K.; Moon, I.; Yu, X.; Ding, Y.; Carrot-Zhang, J.; Gusev, A.
Genome-wide association studies (GWAS) have advanced the understanding of germline susceptibility in common cancers, yet rare malignancies remain underexplored due to limited sample sizes. To address this gap, we conducted large-scale GWAS across 20 rare cancer types and meta-analyzed results from three cohorts: two clinically sequenced cancer center cohorts and an independent population biobank, comprising over 480,000 individuals. We identified nine novel genome-wide significant susceptibility loci with moderate to large effect sizes that replicated across cohorts in eight rare malignancies, including myelodysplastic syndromes (MDS), germ cell tumors, gastrointestinal stromal tumor (GIST), gastrointestinal neuroendocrine tumors, anal cancer (ANSC), non-melanoma skin cancer, mesothelioma, and hepatobiliary cancer. Among the strongest associations were loci in MDS near API5 (OR = 2.21, p = 1.06 × 10^-8), in GIST near SLC6A18 and TERT (OR = 1.91, p = 8.20 × 10^-50), and in ANSC near HLA-DQA2 (OR = 1.58, p = 5.50 × 10^-18). The GIST risk variant was enriched in tumors harboring somatic KIT mutations (OR = 2.21, p = 6.5 × 10^-4) and was associated with worse survival among carriers with KIT-mutant tumors (hazard ratio = 4.06, p = 0.015), implicating germline-somatic interplay in tumor initiation and progression. The ANSC risk variant was associated with HPV infection (OR = 1.44, p = 3.19 × 10^-5), supporting a host-viral interaction in HPV-driven tumorigenesis. The MDS risk variant at the API5 locus was associated with altered neutrophil counts, suggesting a role in hematopoietic dysregulation in disease pathogenesis. We further identified novel, independent associations with mesothelioma, GIST, and hepatobiliary cancer at the 5p15.33 locus encompassing TERT, consistent with pleiotropic genetic effects at a core telomere-maintenance gene. 
Collectively, these findings demonstrate that integrating clinically ascertained sequencing cohorts with population biobanks substantially enhances germline discovery in rare cancers, enabling identification of high-confidence susceptibility loci and facilitating downstream biological interpretation through linked somatic, viral, and clinical data. This framework provides a scalable approach for characterizing inherited susceptibility across diverse rare malignancies.
Wang, B.; Wan, S.; Zhang, P.; Zhang, Y.; Wang, X.; Dong, L.; Ye, K.; Yang, X.
The complete assembly of the human Y chromosome remains a challenge due to its highly repetitive and complex structure. While complete telomere-to-telomere (T2T) assemblies have been generated for a few individuals, such high-quality resources for East Asian populations, particularly for well-characterized multi-omics reference cohorts, are still scarce. The Chinese Quartet, comprising monozygotic twin daughters and their parents, is a premier reference material for genomic studies, yet a T2T-level Y chromosome assembly for this pedigree was lacking. Here, we present a complete, gapless T2T assembly of the Y chromosome (designated CQ-chrY) from the father of the Chinese Quartet. This assembly was generated by integrating Oxford Nanopore ultra-long reads, PacBio HiFi reads, and Hi-C data, resulting in a sequence of 61.88 Mb. The assembly shows exceptional base accuracy (QV = 51.09) and structural completeness (GCI = 100; CRAQ AQI = 95.217). We completely resolved the 33.52 Mb Yq12 heterochromatic region and annotated 164 protein-coding genes and 51.03 Mb (82.47%) of repetitive sequences. This CQ-chrY assembly represents the third complete Chinese Y chromosome and fills the last gap in the T2T assemblies of the Quartet family, providing an invaluable paternal haplotype resource for expanding East Asian genomic standards and for studies on Y chromosome structural variation and evolution.
Cataldo-Ramirez, C.; Lin, M.; McMahon, A.; Gignoux, C.; Weaver, T. D.; Henn, B. M.
Genome-wide association studies (GWAS) and polygenic score (PGS) development are typically constrained by the data available in biobank repositories in which European cohorts are vastly overrepresented. Here, we increase the utility of non-European participant data within the UK Biobank (UKB) by characterizing the genetic affinities of UKB participants who self-identify as Bangladeshi, Indian, Pakistani, "White and Asian" (WA), and "Any Other Asian" (AOA), towards creating a more robust South Asian sample size for future genetic analyses. We assess the relationships between genetic structure and self-selected ethnic identities and use consistent patterns of clustering in the dataset to train a support vector machine (SVM). The SVM was used to reassign n = 1,853 AOA and WA participants at the subcontinental level, increasing the sample size of the UKB South Asian group by 1,381 additional participants. We further leverage these samples to assess GWAS performance and PGS development. We include environmental covariates in the height GWAS by implementing a rigorous covariate selection procedure, and compare the outputs of two GWAS models: GWASnull and GWASenv. We show that PGS derived from both GWAS models yield predictive performance comparable to that of PGS models developed with an order of magnitude larger training samples, and that environmentally adjusted PGS models reduce the sex bias in predictive performance. In summary, we demonstrate how GWAS performance can be improved by leveraging ambiguous ethnicity codes, ancestry-matched imputation panels, and environmental covariates.
Bandaru, R.; Fu, H.; Zheng, H.; Liang, J.; Wang, L.; Gulati, S.; Hinrichs, B. H.; Teng, M.; Zhang, B.; Kocherginsky, M.; Lin, D.; Hildeman, D. A.; Worden, F. P.; Old, M. O.; Dunlap, N. E.; Kaczmar, J. M.; Gillison, M.; El-Gamal, D.; Wise-Draper, T.; Liu, Y.
Reliable, minimally invasive biomarkers for predicting immunotherapy response in head and neck squamous cell carcinoma (HNSCC) remain an unmet clinical need. Here, using patients from a prospective, multi-institutional phase II clinical trial (NCT02641093), we performed whole genome sequencing of 185 plasma cell-free DNA (cfDNA) samples collected longitudinally from 68 patients with locally advanced, surgically resectable HNSCC undergoing neoadjuvant and adjuvant pembrolizumab treatment. We developed the regional motif diversity score (rMDS), a novel fragmentomic metric quantifying the entropy of cfDNA 5' end motifs across genomic regions. Remarkably, unsupervised analysis revealed that rMDS robustly distinguished immunotherapy responders from non-responders, outperforming established cfDNA fragmentomic metrics and copy number alterations, while demonstrating independence from technical confounders. Longitudinal analysis revealed dynamic rMDS changes in genomic regions enriched for immune, lectin, and keratinization-related genes, hallmarks of squamous cell carcinoma, reflecting the interplay between tumor and peripheral immunity during the immunotherapy treatment. Interestingly, the regions with the most dynamic rMDS changes were highly enriched in telomere proximal loci, suggesting a novel link between telomere biology and cfDNA fragmentation. A machine learning classifier based on rMDS achieved robust predictive performance across multiple validation settings (AUC 0.89-0.99), with the highest accuracy at post-treatment timepoints and superior to PD-L1 expression and tumor fraction in the same sample. Predicted responders demonstrated significant trends toward improved disease-free survival (log rank test p=0.035, hazard ratio: 2.67, 95% confidence interval: 1.03-6.92), underscoring the clinical utility of rMDS-based stratification. 
These findings position rMDS as a biologically meaningful and clinically actionable biomarker for immunotherapy response in HNSCC, supporting its integration into future risk assessment frameworks and broader cancer care.
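The abstract describes rMDS as an entropy of cfDNA 5' end motifs computed per genomic region. The following is a minimal sketch of that idea, not the authors' implementation: it assumes 4-mer end motifs, fixed-width genomic bins, and a Shannon entropy normalized by the log of the number of possible motifs. The function names, the `bin_size` parameter, and the `(chrom, start, motif)` input layout are all hypothetical.

```python
import math
from collections import Counter
from itertools import product

MOTIF_LEN = 4  # assumption: 4-mer end motifs; the abstract does not state the length
ALL_MOTIFS = ["".join(p) for p in product("ACGT", repeat=MOTIF_LEN)]


def motif_diversity_score(end_motifs):
    """Normalized Shannon entropy of end-motif frequencies, in [0, 1]."""
    # Keep only well-formed motifs (right length, unambiguous bases).
    counts = Counter(
        m for m in end_motifs if len(m) == MOTIF_LEN and set(m) <= set("ACGT")
    )
    total = sum(counts.values())
    if total == 0:
        return float("nan")
    entropy = -sum((c / total) * math.log(c / total) for c in counts.values())
    # Divide by the maximum possible entropy (uniform over all motifs).
    return entropy / math.log(len(ALL_MOTIFS))


def regional_scores(fragments, bin_size=1_000_000):
    """Score each genomic bin separately.

    fragments: iterable of (chrom, start, motif) tuples (hypothetical layout).
    Returns a dict mapping (chrom, bin_index) to the bin's diversity score.
    """
    by_region = {}
    for chrom, start, motif in fragments:
        by_region.setdefault((chrom, start // bin_size), []).append(motif)
    return {region: motif_diversity_score(m) for region, m in by_region.items()}
```

A score near 0 means one motif dominates the fragment ends in a region, while a score near 1 means the motifs are close to uniformly distributed.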
Pham, C. V. K.; Abdelmalek, F. S. A.; Hua, T.; Apel, E.; Bizjak, A.; Schmidt, E. J.; Houlahan, K. E.
Commonly used human reference genomes collapse extensive genetic variability into a single linear genome of which 70% is derived from one donor. These linear genomes fail to capture the full spectrum of genetic variation, which can lead to misalignment of sequencing reads, particularly for individuals underrepresented by the linear reference genomes. To address this shortcoming, the Human Pangenome Reference Consortium released the first draft of the human pangenome reference, a graph-based reference that integrates diverse haplotypes. While the human pangenome reference has shown increased accuracy in detecting inherited DNA variants, it remains to be seen if the observed improvements extend to somatic mutation detection. Here, we systematically benchmarked somatic single nucleotide variant (SNV) detection leveraging the human pangenome in 30 whole exome sequenced bladder tumours with matched blood tissue of diverse ancestries. We found somatic SNV detection leveraging the human pangenome reference outperformed the linear reference, most notably in individuals of East Asian ancestry where we observed on average a 20% improvement in detection accuracy. Improvements to detection accuracy in individuals of European ancestry were marginal. The increase in accuracy was attributed to reduced germline contamination and reduced reference bias. Further, we demonstrate the pangenome increases SNV detection precision, mitigating the need for time-consuming and computationally expensive ensemble approaches that take the consensus across multiple tools. Finally, we demonstrate that the increased precision when aligned to the pangenome generalized to an additional 29 lung adenocarcinoma tumours, particularly for individuals of East Asian ancestry. These findings support adoption of the pangenome to improve somatic variant detection and reduce ancestry-related disparities.
Gragert, L.; Madbouly, A.; Bashyal, P.; Wadsworth, K.; Kempenich, J.; Bolon, Y.-T.; Maiers, M.
The human leukocyte antigen (HLA) system is the primary determinant of donor selection in allogeneic hematopoietic cell transplantation (HCT) and plays a central role in solid organ transplantation, immune-mediated disease studies, evolutionary population genetics, and immunotherapy. Large-scale sampling of registry participants reflecting major US ancestry groups allows for characterization of the complex landscape of HLA haplotype diversity for the classical HLA class I (HLA-A, HLA-B, HLA-C) and HLA class II (HLA-DRB1, HLA-DRB3, HLA-DRB4, HLA-DRB5, HLA-DQA1, HLA-DQB1, HLA-DPA1, and HLA-DPB1) genes. Here we present nine-locus classical HLA allele and haplotype frequency estimates for five broad (Black, White, Asian or Pacific Islander, Hispanic and Native American) and 21 detailed US populations based on 9,671,082 donors with targeted genotyping by DNA-based methods. Frequency estimation used an expectation-maximization (EM) framework specifically adapted to handle mixed-resolution and ambiguous HLA genotyping data. Advancements in next-generation sequencing provide extensive HLA genotyping, offering new insights into the haplotype structure and diversity of the human MHC complex, expanding knowledge especially for HLA class II haplotypes. Population analyses reveal that the most common high-resolution haplotypes are predominantly population-specific, with only three haplotypes shared across the top-100 lists of all five broad population groups, and that Black populations exhibit the greatest nine-locus haplotypic diversity, a pattern that persists after controlling for differences in registry sample size. These frequencies, derived from the largest US cohort to date, support clinical decision-making and research in histocompatibility, immunogenetics, and transplantation and are publicly available at https://zenodo.org/records/17966993.
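The expectation-maximization (EM) approach to haplotype frequency estimation mentioned above can be illustrated with a textbook version of the algorithm. This sketch is an assumption-laden simplification, not the registry's method: it handles only phase ambiguity in fully typed, unphased genotypes under Hardy-Weinberg equilibrium, and does not model the mixed-resolution or ambiguous typing the abstract refers to. The function names and the genotype layout are hypothetical.

```python
from collections import defaultdict
from itertools import product


def compatible_pairs(genotype):
    """List unordered haplotype pairs consistent with an unphased genotype.

    genotype: tuple of (allele_a, allele_b) per locus (hypothetical layout).
    """
    pairs = set()
    for choice in product((0, 1), repeat=len(genotype)):
        h1 = tuple(g[c] for g, c in zip(genotype, choice))
        h2 = tuple(g[1 - c] for g, c in zip(genotype, choice))
        pairs.add(tuple(sorted((h1, h2))))
    return sorted(pairs)


def em_haplotype_frequencies(genotypes, n_iter=100):
    """Estimate haplotype frequencies by EM, assuming Hardy-Weinberg equilibrium."""
    pair_lists = [compatible_pairs(g) for g in genotypes]
    haplotypes = {h for pairs in pair_lists for pair in pairs for h in pair}
    freq = {h: 1.0 / len(haplotypes) for h in haplotypes}  # uniform start
    n_hap = 2 * len(genotypes)
    for _ in range(n_iter):
        counts = defaultdict(float)
        for pairs in pair_lists:
            # E-step: posterior probability of each phase resolution.
            weights = [
                freq[h1] * freq[h2] * (1 if h1 == h2 else 2) for h1, h2 in pairs
            ]
            total = sum(weights)
            for (h1, h2), w in zip(pairs, weights):
                counts[h1] += w / total
                counts[h2] += w / total
        # M-step: expected haplotype counts become the new frequencies.
        freq = {h: counts[h] / n_hap for h in haplotypes}
    return freq
```

In a toy two-locus sample, homozygous individuals pin down the common haplotypes, and the EM iterations then resolve a double heterozygote toward the phase that those common haplotypes support.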
Song, H.; Xu, J.; Velazquez-Arcelay, K.; Demirci, A.; Raizenne, B. L.; Hsu, S. C.; Choi, J.; Pham, J. H.; Chen, Y.-A.; Weinstein, H. N. W.; Salzman, I.; Tsui, M.; Akutagawa, J.; Adingo, W.; Goldschmidt, E.; Carroll, P. R.; Hong, J. C.; Heaphy, C. M.; Cooperberg, M. R.; Greenland, N.; Campbell, J. D.; Huang, F. W.
Prostate cancer encompasses a spectrum of disease states driven by complex cellular heterogeneity. To delineate the transcriptional programs underlying lineage plasticity and metastasis, we constructed a comprehensive single-cell atlas of 128 patients, spanning localized, castration-resistant, and metastatic disease. Lineage plasticity was prevalent in localized disease, with subsets of tumor cells adopting distinct basal-like and club-like states. Luminal-like cancer cells also displayed extensive lineage infidelity, defined not by a binary loss of identity but by the combinatorial erosion of luminal gene modules associated with higher grade and stage. In the metastatic setting, gene program association analysis (GPAS) identified a broad induction of cell-cycle gene modules across organ sites as well as an induction of organ-specific gene modules, including osteomimetic signaling in bone, neuro-migratory genes in brain, and erythroid-like transitions in liver. Neuroendocrine prostate cancers (NEPCs) were not monolithic but defined by combinations of NE-associated gene modules including a novel HES6 program. Notably, these modules were detected at intermediate levels in localized samples, suggesting molecular plasticity precedes histological transformation. We also developed a refined NE signature that could distinguish NEPC tumors more accurately than previously published signatures. Within the tumor microenvironment (TME), we observed an elevation of pro-inflammatory Th17 T-cells in African American patients and identified a rare Schwann cell population. Finally, we present PCformer, a transformer-based foundation model trained on >500,000 cells to automate cell-state classification. Together, this comprehensive atlas demonstrates the complex nature of gene modules underlying lineage infidelity and plasticity in cancer cells and highlights distinct immune and stromal populations within the tumor ecosystem.
Zehra, B.; BinEshaq, S.; Faizan, M.; Eldesouky, M.; Vinod, N.; Mohamed, N.; Vijayakumar, A.; Aleksandrova, I.; Tambi, R.; Sabeel, S.; Advani, D.; Hashmi, A.; Al-Shaibani, S.; Almarri, M.; Nassir, N.; Almansoori, S.; Du Plessis, S.; Uddin, M.; Berdiev, B.
Sudden unexpected death in epilepsy (SUDEP) is the most devastating complication of epilepsy, yet the molecular features distinguishing individuals at risk remain poorly defined. Although epilepsy and SUDEP share substantial genetic overlap, fatal outcomes may arise when shared risk genes are differentially deployed across neuronal and cardiac systems. Here, we identify tissue- and isoform-level regulation as a key determinant of divergence between epilepsy and SUDEP risk. We performed a large-scale integrated analysis of genetic variants reported in epilepsy and SUDEP across 419 sequencing-based studies encompassing 35,659 individuals, and quantified gene-level burden using a Bayesian Poisson-Gamma rate ratio framework. This analysis revealed preferential enrichment of genes related to cardiac electrophysiology and contractile function in SUDEP, whereas epilepsy was dominated by genes involved in neuronal excitability and synaptic signaling. To determine how shared genetic loci are deployed across tissues, we integrated GTEx-based tissue expression profiles with long-read single-cell transcriptomic datasets from human heart and brain to resolve isoform-level expression patterns. These analyses revealed pronounced tissue-specific transcript architectures. Cardiac-associated genes, including HCN4, KCNH2, KCNE1, MYH6, MYO18B, and ATP1A2, showed heart-restricted isoform expression, whereas neuronal genes such as ADGRV1, CACNA1A, GRIN2B, HCN1, HCN2, KCNA1, SCN1A, SCN2A, and SCN8A showed brain-restricted isoform expression. Importantly, several shared genes exhibited tissue-partitioned isoform expression, with distinct transcript repertoires in heart and brain, particularly across pathways related to ion transport, signaling, metabolism, and structural organization. Consistent patterns were observed in iPSC-derived cardiomyocytes and neurons, indicating that lineage-dependent deployment of shared genes is preserved in controlled systems. 
Together, these findings suggest that tissue-specific isoform regulation provides a mechanistic basis linking shared epilepsy genetics to SUDEP susceptibility, whereby the same genetic loci contribute to neuronal dysfunction in epilepsy and to cardiac vulnerability in SUDEP. This positions SUDEP as a neuro-cardiac interface disorder shaped by isoform-level regulatory divergence.
Sen, S.; Esteve, P. O.; Tarasia, D.; Dannenberg, R.; Dey, A.; Maulik, U.; Pradhan, S.; Bandyopadhyay, S.
Epigenetic enzymes, writers, readers and erasers regulate chromatin landscapes and participate in tumor heterogeneity. While therapeutic targeting of these enzymes has shown clinical promise, the comparative efficacy of mono- versus dual-inhibitor strategies remains unclear. Here, we introduce a multi-modal platform that uses NicE-viewSeq and integrates automated, deep-learning-based, spatially resolved chromatin accessibility profiling with high-throughput sequencing following epigenetic inhibitor application. Accessible chromatin landscapes were altered along with nucleosome positioning following inhibition of either LSD1 or HDACs alone, or both together. Coordinated modulation of histone marks and the CoREST complex on chromatin was observed across inhibitory conditions. Transcription factor binding analysis identified three predominant families, ETS, RUNT, and bZIP, with enhanced chromatin association upon treatment. Mechanistically, a CoREST-RUNX regulatory axis was uncovered wherein JunB, a member of the bZIP family, displaces CoREST-RUNX at differentially accessible regions, triggering apoptotic pathways. This JunB-mediated mechanism therefore reveals a convergent therapeutic vulnerability, offering new avenues for optimizing combinatorial epigenetic therapy in cancer.
Shi, Z.; Zhang, Z.; Mandla, R.; Hou, K.; Pasaniuc, B.
Polygenic scores (PGS) have emerged as a useful biomarker for stratification of high-risk individuals in genomic medicine, with prediction intervals arising as a principled approach to incorporate statistical uncertainty in their individual-level predictions. In contrast to recent reports by Xu et al. [7], we show that CalPred [6] provides well-calibrated prediction intervals that contain the trait phenotypes at targeted confidence levels. CalPred maintains calibration when PGS performance varies across contextual factors (e.g., ancestry, age, sex, or socio-economic factors), whereas PredInterval [7], a recently introduced method that focuses on marginal calibration across all individuals, exhibits miscalibration.
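The distinction between marginal calibration and calibration within contexts can be made concrete with a coverage check. The sketch below is a generic illustration and is not taken from CalPred or PredInterval: it computes the fraction of observed phenotypes that fall inside their prediction intervals, overall and within each context group. The function names and inputs are hypothetical.

```python
from collections import defaultdict


def marginal_coverage(y_true, lower, upper):
    """Fraction of observations inside their prediction interval, over everyone."""
    inside = [lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper)]
    return sum(inside) / len(inside)


def coverage_by_group(y_true, lower, upper, groups):
    """Interval coverage computed separately within each context group."""
    hits = defaultdict(int)
    totals = defaultdict(int)
    for y, lo, hi, g in zip(y_true, lower, upper, groups):
        totals[g] += 1
        hits[g] += int(lo <= y <= hi)
    return {g: hits[g] / totals[g] for g in totals}


# Toy data: intervals cover every individual in group "a" but only half of "b".
y = [0.0, 0.0, 0.0, 0.0]
lower = [-1.0, -1.0, 1.0, -1.0]
upper = [1.0, 1.0, 2.0, 1.0]
groups = ["a", "a", "b", "b"]

overall = marginal_coverage(y, lower, upper)
per_group = coverage_by_group(y, lower, upper, groups)
```

In the toy data, overall coverage is 0.75 while coverage is 1.0 in one group and 0.5 in the other, which shows how a marginal figure can mask over- and under-coverage within contexts.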
Fraemke, D.; Paulus, L.; Schuurmans, I.; Walter, J.-H.; Czamara, D.; Schowe, A. M.; deSteiguer, A.; Tanksley, P. T.; Okbay, A.; Moenkediek, B.; Instinske, J.; Noethen, M. M.; Disselkamp, C. K. L.; Forstner, A. J.; Binder, E. B.; Kandler, C.; Spinath, F. M.; Lindenberger, U.; Malanchini, M.; Cecil, C. A. M.; Mitchell, C.; Harden, K. P.; Tucker-Drob, E. M.; Raffington, L.
Large-scale genomic studies have identified biomarkers of adult cognitive functioning and educational attainment, yet the developmental pathways connecting these biomarkers to adult outcomes remain unclear. Drawing on four cohorts, we examined the developmental correlates of an epigenetic index of adult cognitive function (Epigenetic-g) alongside polygenic indices of cognition and education. Epigenetic-g and polygenic indices were uncorrelated and captured distinct variation in children's cognitive and academic performance. Longitudinal analyses revealed that Epigenetic-g is plastic in early childhood, reaching moderate stability by adolescence, and, unlike polygenic indices, is not related to longitudinal cognitive growth. Twin models indicated that Epigenetic-g captures genetic and unique environmental variation relevant to cognitive and academic achievement that is not identified by current polygenic indices. Epigenetic indices relevant to psychological development can be generated from DNA methylation studies of adults, with most variation in these indices emerging early in life.
Jacobsen, J. T.; Moller, P. L.; Rohde, P. D.
Genomics offers a powerful approach to identify causal mechanisms underlying coronary artery disease (CAD) risk, with implications for pathogenesis, personalized prevention strategies, and therapeutic target discovery. Functionality-informed genome-wide fine mapping was performed using the Bayesian framework SBayesRC to estimate genetic contributions of 6.9 million common variants, based on GWAS summary statistics from over one million individuals of European ancestry. Causal candidate genes were prioritized in a 5 kb flanking window within high-confidence local credible sets (LCSs). Their downstream biological influence was analyzed using protein-protein interaction networks and pathway enrichment analyses across three complementary dimensions: molecular, cellular, and disease level. Genetic modeling captured the highly polygenic architecture of CAD, estimating on average 34,000 variants to contribute to CAD risk, explaining 3.8% of total phenotypic variance. Thirty-six high-confidence variants (PIP > 0.9) collectively explained 13.6% of genetic variance, while most variants demonstrated small individual effects but with substantial collective contributions. A total of 17,150 variants were prioritized within 581 high-confidence LCSs, of which 195 were annotated to genes and 170 were implicated in downstream pathway analyses. The three most influential variants were mapped to PHACTR1, APOE, and LPL, explaining 2.49%, 1.59%, and 1.46% of genetic variance, respectively. Pathway analyses revealed that genetic risk in CAD is driven by dysregulation of three interlinked biological processes: 1) lipoprotein function and cholesterol metabolism, 2) vascular homeostasis, and 3) cellular stress responses and inflammation. These findings advance the causal understanding of CAD pathogenesis, supporting the transition from association-based to functionality-informed genomic approaches in cardiovascular genetics.
Yuan, H.; Mandava, A.; Sarmart, K.; Ganz, J.; Krishnan, A.
Genome-wide association studies (GWAS) have implicated thousands of loci in complex diseases, but translating these population-level signals into specific cellular contexts remains a central challenge. Integrating GWAS with single-cell transcriptomics data has enabled systematic identification of disease-relevant cell types, yet existing methods face a fundamental tradeoff: approaches like seismic that are optimized for statistical power operate at the annotated cell-type level and miss heterogeneous disease signals concentrated in specific cellular states, while single-cell-resolution approaches like scDRS that capture such heterogeneity often lack sufficient power to detect subtle associations. Here we present ICePop (Informative Cell Populations), a framework that resolves this tradeoff by performing disease-cell type association at metacell resolution, thus achieving statistical power comparable to cell-type-level methods while detecting heterogeneous disease signals within cell types. In simulations against seismic and scDRS, ICePop maintains appropriate false positive rates and demonstrates superior power when disease effects are concentrated in cellular subpopulations. Applied to Tabula Muris across 81 traits and 120 cell types, ICePop identifies 2,178 disease-cell type associations, including the preferential vulnerability of differentiated gut epithelial cells in ulcerative colitis and loss of cell identity in immune-stressed lung capillary endothelial cells underlying their association with lung function. Clustering diseases by metacell association profiles reveals groupings that diverge from genetic risk-based clustering, including separation of blood cell count traits from immune diseases despite shared genetic architecture, reflecting differences in cellular rather than genetic etiology. In autism spectrum disorder, ICePop identifies preferential enrichment of genetic risk in specific enteric neuron subtypes, implicating dysfunction of the enteric nervous system in gastrointestinal comorbidities. ICePop's resolution of disease-relevant cell states within annotated cell types enables generation of testable, cell-state-specific hypotheses about disease mechanisms and therapeutic targets.
Childers, I. R.; Foley, N. M.; Bredemeyer, K. R.; Murphy, W. J.
Meiotic recombination is a crucial biological process that ensures proper chromosomal pairing and promotes adaptation. In placental mammals, recombination rates vary widely across species, populations, sexes, individuals, and chromosomes. While the placental X chromosome shows remarkable conservation of both gene order and the recombination landscape across deep evolutionary history, it is unknown whether similar levels of autosomal conservation persist despite extensive chromosomal evolution. Here, we reconstructed an ancestral placental mammal karyotype from chromosome-level assemblies, using slow rates of karyotypic evolution, and inferred an ancestral autosomal recombination map. Analysis of phylogenetic branch lengths and PhyloP-based scores of evolutionary constraint reveals that conserved autosomal regions with low recombination rates have evolved under stronger purifying selection, whereas regions with conserved high recombination rates are less constrained and freer to evolve. Ancestral autosomal regions with low recombination rates were enriched for pathways and GO terms related to cellular function, whereas ancestral regions with high recombination rates were enriched for regulation and some immune-related systems. Tracking the fate of these conserved ancestral recombination hotspots and coldspots across 13 mammal lineages with variable rates of karyotype evolution revealed the retention of autosomal ancestral high-recombination regions (AHRs), but the absence of conservation of autosomal ancestral low-recombination regions (ALRs). Collectively, our findings reveal variable levels of evolutionary constraint at meiotic recombination in relation to karyotypic evolution, providing new insights into how natural selection influences the evolution of chromosomal organization.